Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 8886058 |
| Missing cells | 35271795 |
| Missing cells (%) | 28.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 949.1 MiB |
| Average record size in memory | 112.0 B |
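The two derived figures in this table can be checked against the raw counts. The sketch below assumes that the missing-cell percentage is the number of missing cells divided by variables × observations, and that the average record size is total memory divided by the number of observations. The function names are illustrative and not part of ydata-profiling.

```python
# Sanity-check the derived dataset statistics from the raw counts above.
# Function names are hypothetical helpers, not ydata-profiling API.
N_VARIABLES = 14
N_OBSERVATIONS = 8_886_058
MISSING_CELLS = 35_271_795
TOTAL_MEMORY_MIB = 949.1


def missing_cell_fraction(missing: int, n_vars: int, n_obs: int) -> float:
    """Share of all cells (variables x observations) that are missing."""
    return missing / (n_vars * n_obs)


def average_record_bytes(total_mib: float, n_obs: int) -> float:
    """Average bytes per row, with 1 MiB = 1024 ** 2 bytes."""
    return total_mib * 1024 ** 2 / n_obs


missing_pct = 100 * missing_cell_fraction(MISSING_CELLS, N_VARIABLES, N_OBSERVATIONS)
record_bytes = average_record_bytes(TOTAL_MEMORY_MIB, N_OBSERVATIONS)

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {record_bytes:.1f} B")
```

Both values round to the figures reported in the table (28.4% and 112.0 B).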
Variable types
| Numeric | 6 |
|---|---|
| Text | 2 |
| DateTime | 1 |
| Categorical | 5 |
Alerts
| promo_bin_1 is highly overall correlated with promo_bin_2 and 1 other field | High correlation |
|---|---|
| promo_bin_2 is highly overall correlated with promo_bin_1 and 3 other fields | High correlation |
| promo_discount_2 is highly overall correlated with promo_bin_2 and 2 other fields | High correlation |
| promo_discount_type_2 is highly overall correlated with promo_bin_1 and 3 other fields | High correlation |
| promo_type_2 is highly overall correlated with promo_bin_2 and 2 other fields | High correlation |
| revenue is highly overall correlated with sales | High correlation |
| sales is highly overall correlated with revenue | High correlation |
| promo_type_1 is highly imbalanced (77.3%) | Imbalance |
| promo_type_2 is highly imbalanced (99.1%) | Imbalance |
| sales has 302296 (3.4%) missing values | Missing |
| revenue has 302296 (3.4%) missing values | Missing |
| stock has 302296 (3.4%) missing values | Missing |
| price has 91381 (1.0%) missing values | Missing |
| promo_bin_1 has 7653515 (86.1%) missing values | Missing |
| promo_bin_2 has 8873337 (99.9%) missing values | Missing |
| promo_discount_2 has 8873337 (99.9%) missing values | Missing |
| promo_discount_type_2 has 8873337 (99.9%) missing values | Missing |
| sales is highly skewed (γ1 = 1557.844936) | Skewed |
| revenue is highly skewed (γ1 = 815.4548181) | Skewed |
| stock is highly skewed (γ1 = 24.21927272) | Skewed |
| Unnamed: 0 is uniformly distributed | Uniform |
| Unnamed: 0 has unique values | Unique |
| sales has 7048907 (79.3%) zeros | Zeros |
| revenue has 7049979 (79.3%) zeros | Zeros |
Reproduction
| Analysis started | 2024-06-25 14:46:13.167959 |
|---|---|
| Analysis finished | 2024-06-25 15:04:01.734227 |
| Duration | 17 minutes and 48.57 seconds |
| Software version | ydata-profiling v0.0.dev0 |
| Download configuration | config.json |
Unnamed: 0
Real number (ℝ)
UNIFORM
UNIQUE
| Distinct | 8886058 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4443029.5 |
| Minimum | 1 |
|---|---|
| Maximum | 8886058 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.8 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 444303.85 |
| Q1 | 2221515.2 |
| median | 4443029.5 |
| Q3 | 6664543.8 |
| 95-th percentile | 8441755.2 |
| Maximum | 8886058 |
| Range | 8886057 |
| Interquartile range (IQR) | 4443028.5 |
Descriptive statistics
| Standard deviation | 2565184.1 |
|---|---|
| Coefficient of variation (CV) | 0.57735024 |
| Kurtosis | -1.2 |
| Mean | 4443029.5 |
| Median Absolute Deviation (MAD) | 2221514.5 |
| Skewness | -1.3646306 × 10⁻¹⁵ |
| Sum | 3.9481018 × 10¹³ |
| Variance | 6.5801696 × 10¹² |
| Monotonicity | Strictly increasing |
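These statistics are what one would expect from a plain row index running from 1 to N with no gaps. The sketch below derives the mean, sum, standard deviation and coefficient of variation in closed form, assuming the reported variance is the sample variance (divisor N − 1), which for the integers 1..N equals N(N + 1)/12. The kurtosis of −1.2 is likewise the excess kurtosis of a uniform distribution.

```python
import math

# Closed-form statistics for an index column holding the integers 1..N.
# Assumption: the report uses the sample variance (divisor N - 1).
N = 8_886_058  # number of observations

mean = (N + 1) / 2                 # midpoint of 1..N
total = N * (N + 1) // 2           # Gauss sum of 1..N
variance = N * (N + 1) / 12        # sample variance of 1..N
std = math.sqrt(variance)
cv = std / mean                    # tends to 1 / sqrt(3) as N grows

print(f"mean     = {mean}")
print(f"sum      = {total:.7e}")
print(f"variance = {variance:.7e}")
print(f"std      = {std:.1f}")
print(f"cv       = {cv:.8f}")
```

The coefficient of variation close to 1/√3 ≈ 0.577 is a quick indicator that the column is an index rather than a measured quantity.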
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 5924034 | 1 | < 0.1% |
| 5924048 | 1 | < 0.1% |
| 5924047 | 1 | < 0.1% |
| 5924046 | 1 | < 0.1% |
| 5924045 | 1 | < 0.1% |
| 5924044 | 1 | < 0.1% |
| 5924043 | 1 | < 0.1% |
| 5924042 | 1 | < 0.1% |
| 5924041 | 1 | < 0.1% |
| Other values (8886048) | 8886048 | > 99.9% |
| Value | Count | Frequency (%) |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 8886058 | 1 | < 0.1% |
| 8886057 | 1 | < 0.1% |
| 8886056 | 1 | < 0.1% |
| 8886055 | 1 | < 0.1% |
| 8886054 | 1 | < 0.1% |
| 8886053 | 1 | < 0.1% |
| 8886052 | 1 | < 0.1% |
| 8886051 | 1 | < 0.1% |
| 8886050 | 1 | < 0.1% |
| 8886049 | 1 | < 0.1% |
store_id
Text
| Distinct | 63 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.8 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Characters and Unicode
| Total characters | 44430290 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | S0002 |
|---|---|
| 2nd row | S0002 |
| 3rd row | S0002 |
| 4th row | S0002 |
| 5th row | S0002 |
| Value | Count | Frequency (%) |
| s0038 | 334082 | 3.8% |
| s0085 | 325409 | 3.7% |
| s0097 | 279019 | 3.1% |
| s0094 | 276217 | 3.1% |
| s0104 | 271338 | 3.1% |
| s0062 | 267921 | 3.0% |
| s0026 | 266261 | 3.0% |
| s0056 | 260416 | 2.9% |
| s0020 | 253996 | 2.9% |
| s0108 | 249346 | 2.8% |
| Other values (53) | 6102053 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 17984981 | |
| S | 8886058 | |
| 1 | 3355402 | 7.6% |
| 2 | 3330114 | 7.5% |
| 5 | 2016937 | 4.5% |
| 8 | 1722869 | 3.9% |
| 6 | 1692194 | 3.8% |
| 3 | 1655994 | 3.7% |
| 4 | 1418789 | 3.2% |
| 9 | 1281762 | 2.9% |
product_id
Text
| Distinct | 615 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.8 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 5 |
| Min length | 5 |
Characters and Unicode
| Total characters | 44430290 |
|---|---|
| Distinct characters | 11 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | P0001 |
|---|---|
| 2nd row | P0005 |
| 3rd row | P0011 |
| 4th row | P0015 |
| 5th row | P0017 |
| Value | Count | Frequency (%) |
| p0664 | 59051 | 0.7% |
| p0125 | 58708 | 0.7% |
| p0261 | 58504 | 0.7% |
| p0364 | 58428 | 0.7% |
| p0131 | 58117 | 0.7% |
| p0694 | 57956 | 0.7% |
| p0116 | 57940 | 0.7% |
| p0390 | 57872 | 0.7% |
| p0372 | 57699 | 0.6% |
| p0333 | 57473 | 0.6% |
| Other values (605) | 8304310 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 11721008 | |
| P | 8886058 | |
| 1 | 3416438 | 7.7% |
| 6 | 3077549 | 6.9% |
| 4 | 2994432 | 6.7% |
| 5 | 2974769 | 6.7% |
| 2 | 2926654 | 6.6% |
| 3 | 2851268 | 6.4% |
| 7 | 2273576 | 5.1% |
| 9 | 1760049 | 4.0% |
date
Date
| Distinct | 1033 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.8 MiB |
| Minimum | 2017-01-02 00:00:00 |
|---|---|
| Maximum | 2019-10-31 00:00:00 |
sales
Real number (ℝ)
HIGH CORRELATION
MISSING
SKEWED
ZEROS
| Distinct | 5435 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 302296 |
| Missing (%) | 3.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.47340804 |
| Minimum | 0 |
|---|---|
| Maximum | 43301 |
| Zeros | 7048907 |
| Zeros (%) | 79.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2 |
| Maximum | 43301 |
| Range | 43301 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 21.290586 |
|---|---|
| Coefficient of variation (CV) | 44.973012 |
| Kurtosis | 2698722.2 |
| Mean | 0.47340804 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1557.8449 |
| Sum | 4063622 |
| Variance | 453.28904 |
| Monotonicity | Not monotonic |
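The descriptive statistics for sales are consistent with each other only if missing values are excluded from the mean while the zero share is taken over all rows. The sketch below recomputes the mean, coefficient of variation, variance and zero share from the figures reported above. Treating missing rows this way is an assumption that happens to reproduce the reported values.

```python
# Cross-check the sales statistics from the reported counts.
# Assumption: mean uses non-missing rows only; zero share uses all rows.
N_OBSERVATIONS = 8_886_058
N_MISSING = 302_296
N_ZEROS = 7_048_907
SALES_SUM = 4_063_622
SALES_STD = 21.290586

n_valid = N_OBSERVATIONS - N_MISSING   # rows with an observed sales value
mean = SALES_SUM / n_valid             # mean over observed rows
cv = SALES_STD / mean                  # coefficient of variation
variance = SALES_STD ** 2              # variance is the squared std
zeros_pct = 100 * N_ZEROS / N_OBSERVATIONS

print(f"mean      = {mean:.8f}")
print(f"cv        = {cv:.6f}")
print(f"variance  = {variance:.5f}")
print(f"zeros (%) = {zeros_pct:.1f}")
```

A coefficient of variation near 45 with a median of 0 reflects a zero-inflated series in which a handful of very large values drive the spread.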
| Value | Count | Frequency (%) |
| 0 | 7048907 | 79.3% |
| 1 | 848271 | 9.5% |
| 2 | 298996 | 3.4% |
| 3 | 130620 | 1.5% |
| 4 | 73361 | 0.8% |
| 5 | 43412 | 0.5% |
| 6 | 30100 | 0.3% |
| 7 | 19260 | 0.2% |
| 8 | 14027 | 0.2% |
| 9 | 10168 | 0.1% |
| Other values (5425) | 66640 | 0.7% |
| (Missing) | 302296 | 3.4% |
| Value | Count | Frequency (%) |
| 0 | 7048907 | 79.3% |
| 0.018 | 1 | < 0.1% |
| 0.022 | 1 | < 0.1% |
| 0.024 | 1 | < 0.1% |
| 0.03 | 1 | < 0.1% |
| 0.032 | 1 | < 0.1% |
| 0.034 | 1 | < 0.1% |
| 0.038 | 1 | < 0.1% |
| 0.042 | 1 | < 0.1% |
| 0.044 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 43301 | 1 | < 0.1% |
| 27656 | 1 | < 0.1% |
| 27652 | 1 | < 0.1% |
| 13828 | 1 | < 0.1% |
| 13826 | 1 | < 0.1% |
| 6408 | 1 | < 0.1% |
| 1801 | 1 | < 0.1% |
| 1720 | 1 | < 0.1% |
| 1000 | 3 | |
| 816 | 1 | < 0.1% |
revenue
Real number (ℝ)
HIGH CORRELATION
MISSING
SKEWED
ZEROS
| Distinct | 12155 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 302296 |
| Missing (%) | 3.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.285173 |
| Minimum | 0 |
|---|---|
| Maximum | 84197.961 |
| Zeros | 7049979 |
| Zeros (%) | 79.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 11.76 |
| Maximum | 84197.961 |
| Range | 84197.961 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 54.06806 |
|---|---|
| Coefficient of variation (CV) | 23.66038 |
| Kurtosis | 966651.01 |
| Mean | 2.285173 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 815.45482 |
| Sum | 19615381 |
| Variance | 2923.3551 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 7049979 | 79.3% |
| 0.93 | 33675 | 0.4% |
| 3.24 | 27591 | 0.3% |
| 1.85 | 25154 | 0.3% |
| 2.31 | 23568 | 0.3% |
| 2.78 | 23341 | 0.3% |
| 2.73 | 18024 | 0.2% |
| 1.39 | 17439 | 0.2% |
| 1.16 | 16214 | 0.2% |
| 3.66 | 15914 | 0.2% |
| Other values (12145) | 1332863 | 15.0% |
| (Missing) | 302296 | 3.4% |
| Value | Count | Frequency (%) |
| 0 | 7049979 | 79.3% |
| 0.01 | 158 | < 0.1% |
| 0.02 | 16 | < 0.1% |
| 0.03 | 10 | < 0.1% |
| 0.05 | 2 | < 0.1% |
| 0.06 | 1 | < 0.1% |
| 0.1 | 1 | < 0.1% |
| 0.23 | 537 | < 0.1% |
| 0.25 | 1 | < 0.1% |
| 0.27 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 84197.961 | 1 | |
| 52496.852 | 1 | |
| 52488.699 | 1 | |
| 32490.51 | 1 | |
| 31150 | 1 | |
| 30327.01 | 1 | |
| 26711.859 | 1 | |
| 26247.59 | 1 | |
| 26243.801 | 1 | |
| 25423.73 | 1 |
stock
Real number (ℝ)
MISSING
SKEWED
| Distinct | 9039 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 302296 |
| Missing (%) | 3.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16.005747 |
| Minimum | 0 |
|---|---|
| Maximum | 4655 |
| Zeros | 66086 |
| Zeros (%) | 0.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.8 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 8 |
| Q3 | 17 |
| 95-th percentile | 48 |
| Maximum | 4655 |
| Range | 4655 |
| Interquartile range (IQR) | 13 |
Descriptive statistics
| Standard deviation | 37.516921 |
|---|---|
| Coefficient of variation (CV) | 2.3439656 |
| Kurtosis | 1418.7495 |
| Mean | 16.005747 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 24.219273 |
| Sum | 1.3738952 × 10⁸ |
| Variance | 1407.5194 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 4 | 619203 | 7.0% |
| 3 | 617176 | 6.9% |
| 6 | 600701 | 6.8% |
| 2 | 585179 | 6.6% |
| 5 | 571932 | 6.4% |
| 1 | 464188 | 5.2% |
| 7 | 435351 | 4.9% |
| 8 | 387802 | 4.4% |
| 9 | 354054 | 4.0% |
| 12 | 353011 | 4.0% |
| Other values (9029) | 3595165 |
| Value | Count | Frequency (%) |
| 0 | 66086 | |
| 0.001 | 38 | < 0.1% |
| 0.002 | 53 | < 0.1% |
| 0.003 | 65 | < 0.1% |
| 0.004 | 323 | < 0.1% |
| 0.005 | 333 | < 0.1% |
| 0.006 | 30 | < 0.1% |
| 0.007 | 15 | < 0.1% |
| 0.008 | 25 | < 0.1% |
| 0.009 | 11 | < 0.1% |
| Value | Count | Frequency (%) |
| 4655 | 1 | |
| 4582 | 1 | |
| 4473 | 1 | |
| 4404 | 1 | |
| 4384 | 1 | |
| 4320 | 1 | |
| 4308 | 1 | |
| 4292 | 1 | |
| 4273 | 1 | |
| 4243 | 1 |
price
Real number (ℝ)
MISSING
| Distinct | 606 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 91381 |
| Missing (%) | 1.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.753767 |
| Minimum | 0.01 |
|---|---|
| Maximum | 1599 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.8 MiB |
Quantile statistics
| Minimum | 0.01 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3.45 |
| median | 8 |
| Q3 | 16.95 |
| 95-th percentile | 53.9 |
| Maximum | 1599 |
| Range | 1598.99 |
| Interquartile range (IQR) | 13.5 |
Descriptive statistics
| Standard deviation | 32.77869 |
|---|---|
| Coefficient of variation (CV) | 2.0806891 |
| Kurtosis | 521.81016 |
| Mean | 15.753767 |
| Median Absolute Deviation (MAD) | 5.5 |
| Skewness | 16.550523 |
| Sum | 1.3854929 × 10⁸ |
| Variance | 1074.4425 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 243419 | 2.7% |
| 3.95 | 156115 | 1.8% |
| 3.5 | 147047 | 1.7% |
| 0.75 | 146503 | 1.6% |
| 2.95 | 137147 | 1.5% |
| 19.9 | 133650 | 1.5% |
| 11.9 | 131649 | 1.5% |
| 1.75 | 116461 | 1.3% |
| 12.9 | 115990 | 1.3% |
| 2.5 | 109490 | 1.2% |
| Other values (596) | 7357206 |
| Value | Count | Frequency (%) |
| 0.01 | 136 | < 0.1% |
| 0.25 | 7983 | 0.1% |
| 0.3 | 107 | < 0.1% |
| 0.35 | 237 | < 0.1% |
| 0.4 | 1175 | < 0.1% |
| 0.45 | 12344 | 0.1% |
| 0.5 | 45814 | |
| 0.58 | 512 | < 0.1% |
| 0.6 | 24658 | |
| 0.65 | 50459 |
| Value | Count | Frequency (%) |
| 1599 | 174 | < 0.1% |
| 1549 | 115 | < 0.1% |
| 1499 | 160 | < 0.1% |
| 1449 | 95 | < 0.1% |
| 1399 | 127 | < 0.1% |
| 1349 | 174 | < 0.1% |
| 849.9 | 574 | |
| 749.9 | 8 | < 0.1% |
| 749 | 63 | < 0.1% |
| 699.9 | 19 | < 0.1% |
promo_type_1
Categorical
IMBALANCE
| Distinct | 17 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.8 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 35544232 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | PR14 |
|---|---|
| 2nd row | PR14 |
| 3rd row | PR14 |
| 4th row | PR14 |
| 5th row | PR14 |
Common Values
| Value | Count | Frequency (%) |
| PR14 | 7653515 | |
| PR05 | 547253 | 6.2% |
| PR10 | 213664 | 2.4% |
| PR03 | 151863 | 1.7% |
| PR06 | 124289 | 1.4% |
| PR07 | 57419 | 0.6% |
| PR12 | 40840 | 0.5% |
| PR09 | 35752 | 0.4% |
| PR17 | 32863 | 0.4% |
| PR01 | 12618 | 0.1% |
| Other values (7) | 15982 | 0.2% |
Length
| Value | Count | Frequency (%) |
| pr14 | 7653515 | |
| pr05 | 547253 | 6.2% |
| pr10 | 213664 | 2.4% |
| pr03 | 151863 | 1.7% |
| pr06 | 124289 | 1.4% |
| pr07 | 57419 | 0.6% |
| pr12 | 40840 | 0.5% |
| pr09 | 35752 | 0.4% |
| pr17 | 32863 | 0.4% |
| pr01 | 12618 | 0.1% |
| Other values (7) | 15982 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| P | 8886058 | |
| R | 8886058 | |
| 1 | 7966930 | |
| 4 | 7656898 | |
| 0 | 1150417 | 3.2% |
| 5 | 547272 | 1.5% |
| 3 | 152470 | 0.4% |
| 6 | 125201 | 0.4% |
| 7 | 90282 | 0.3% |
| 2 | 40840 | 0.1% |
| Other values (2) | 41806 | 0.1% |
promo_bin_1
Categorical
HIGH CORRELATION
MISSING
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 7653515 |
| Missing (%) | 86.1% |
| Memory size | 67.8 MiB |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 6.0572256 |
| Min length | 3 |
Characters and Unicode
| Total characters | 7465791 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | verylow |
|---|---|
| 2nd row | moderate |
| 3rd row | low |
| 4th row | high |
| 5th row | low |
Common Values
| Value | Count | Frequency (%) |
| verylow | 514398 | 5.8% |
| low | 259135 | 2.9% |
| moderate | 193475 | 2.2% |
| high | 146120 | 1.6% |
| veryhigh | 119415 | 1.3% |
| (Missing) | 7653515 | 86.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| verylow | 514398 | |
| low | 259135 | |
| moderate | 193475 | 15.7% |
| high | 146120 | 11.9% |
| veryhigh | 119415 | 9.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1020763 | |
| o | 967008 | |
| r | 827288 | |
| l | 773533 | |
| w | 773533 | |
| v | 633813 | |
| y | 633813 | |
| h | 531070 | |
| i | 265535 | 3.6% |
| g | 265535 | 3.6% |
| Other values (4) | 773900 |
promo_type_2
Categorical
HIGH CORRELATION
IMBALANCE
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 67.8 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 35544232 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | PR03 |
|---|---|
| 2nd row | PR03 |
| 3rd row | PR03 |
| 4th row | PR03 |
| 5th row | PR03 |
Common Values
| Value | Count | Frequency (%) |
| PR03 | 8873337 | |
| PR02 | 7026 | 0.1% |
| PR04 | 2892 | < 0.1% |
| PR01 | 2803 | < 0.1% |
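The 99.1% imbalance alert for promo_type_2 can be reproduced from these counts. The sketch below assumes the imbalance score is one minus the Shannon entropy of the class frequencies, normalized by the maximum entropy log2(k) for k classes. That definition is an assumption about how the tool computes the figure, and `imbalance_score` is an illustrative helper rather than a library function.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a balanced column, near 1 when one class dominates.

    Assumed definition; H is the Shannon entropy (base 2) of the class shares.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Counts for PR03, PR02, PR04, PR01 from the Common Values table above.
promo_type_2_counts = [8_873_337, 7_026, 2_892, 2_803]
score = imbalance_score(promo_type_2_counts)

print(f"promo_type_2 imbalance: {100 * score:.1f}%")
```

Under this definition a perfectly balanced column scores 0, so a value above 99% means the column carries almost no information beyond its dominant class.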
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| pr03 | 8873337 | |
| pr02 | 7026 | 0.1% |
| pr04 | 2892 | < 0.1% |
| pr01 | 2803 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| P | 8886058 | |
| R | 8886058 | |
| 0 | 8886058 | |
| 3 | 8873337 | |
| 2 | 7026 | < 0.1% |
| 4 | 2892 | < 0.1% |
| 1 | 2803 | < 0.1% |
promo_bin_2
Categorical
HIGH CORRELATION
MISSING
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 8873337 |
| Missing (%) | 99.9% |
| Memory size | 67.8 MiB |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 6.3500511 |
| Min length | 4 |
Characters and Unicode
| Total characters | 80779 |
|---|---|
| Distinct characters | 10 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | verylow |
|---|---|
| 2nd row | verylow |
| 3rd row | verylow |
| 4th row | verylow |
| 5th row | verylow |
Common Values
| Value | Count | Frequency (%) |
| verylow | 6441 | 0.1% |
| high | 3637 | < 0.1% |
| veryhigh | 2643 | < 0.1% |
| (Missing) | 8873337 | 99.9% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| verylow | 6441 | |
| high | 3637 | |
| veryhigh | 2643 |
Most occurring characters
| Value | Count | Frequency (%) |
| h | 12560 | |
| v | 9084 | |
| e | 9084 | |
| r | 9084 | |
| y | 9084 | |
| l | 6441 | |
| o | 6441 | |
| w | 6441 | |
| i | 6280 | |
| g | 6280 |
promo_discount_2
Real number (ℝ)
HIGH CORRELATION
MISSING
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 8873337 |
| Missing (%) | 99.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 30.110605 |
| Minimum | 16 |
|---|---|
| Maximum | 50 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 67.8 MiB |
Quantile statistics
| Minimum | 16 |
|---|---|
| 5-th percentile | 20 |
| Q1 | 20 |
| median | 20 |
| Q3 | 35 |
| 95-th percentile | 50 |
| Maximum | 50 |
| Range | 34 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 11.8509 |
|---|---|
| Coefficient of variation (CV) | 0.39357893 |
| Kurtosis | -1.0464257 |
| Mean | 30.110605 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 0.66762654 |
| Sum | 383037 |
| Variance | 140.44382 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 20 | 6226 | 0.1% |
| 33 | 2804 | < 0.1% |
| 50 | 2643 | < 0.1% |
| 35 | 585 | < 0.1% |
| 40 | 248 | < 0.1% |
| 16 | 215 | < 0.1% |
| (Missing) | 8873337 | 99.9% |
| Value | Count | Frequency (%) |
| 16 | 215 | < 0.1% |
| 20 | 6226 | |
| 33 | 2804 | |
| 35 | 585 | < 0.1% |
| 40 | 248 | < 0.1% |
| 50 | 2643 |
| Value | Count | Frequency (%) |
| 50 | 2643 | |
| 40 | 248 | < 0.1% |
| 35 | 585 | < 0.1% |
| 33 | 2804 | |
| 20 | 6226 | |
| 16 | 215 | < 0.1% |
promo_discount_type_2
Categorical
HIGH CORRELATION
MISSING
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 8873337 |
| Missing (%) | 99.9% |
| Memory size | 67.8 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 50884 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | PR04 |
|---|---|
| 2nd row | PR02 |
| 3rd row | PR04 |
| 4th row | PR02 |
| 5th row | PR02 |
Common Values
| Value | Count | Frequency (%) |
| PR01 | 3762 | < 0.1% |
| PR02 | 3648 | < 0.1% |
| PR04 | 2793 | < 0.1% |
| PR03 | 2518 | < 0.1% |
| (Missing) | 8873337 | 99.9% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| pr01 | 3762 | |
| pr02 | 3648 | |
| pr04 | 2793 | |
| pr03 | 2518 |
Most occurring characters
| Value | Count | Frequency (%) |
| P | 12721 | |
| R | 12721 | |
| 0 | 12721 | |
| 1 | 3762 | 7.4% |
| 2 | 3648 | 7.2% |
| 4 | 2793 | 5.5% |
| 3 | 2518 | 4.9% |
| | Unnamed: 0 | price | promo_bin_1 | promo_bin_2 | promo_discount_2 | promo_discount_type_2 | promo_type_1 | promo_type_2 | revenue | sales | stock |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Unnamed: 0 | 1.000 | 0.025 | 0.021 | 0.412 | 0.150 | 0.309 | 0.013 | 0.031 | -0.039 | -0.039 | -0.011 |
| price | 0.025 | 1.000 | 0.043 | 0.078 | -0.138 | 0.100 | 0.064 | 0.002 | -0.208 | -0.252 | -0.360 |
| promo_bin_1 | 0.021 | 0.043 | 1.000 | 0.872 | -0.311 | 0.832 | 0.430 | 0.029 | 0.059 | 0.062 | 0.088 |
| promo_bin_2 | 0.412 | 0.078 | 0.872 | 1.000 | -0.778 | 0.904 | 0.303 | 0.885 | -0.010 | -0.014 | -0.088 |
| promo_discount_2 | 0.150 | -0.138 | -0.311 | -0.778 | 1.000 | 0.777 | 0.348 | 0.985 | 0.053 | 0.054 | 0.031 |
| promo_discount_type_2 | 0.309 | 0.100 | 0.832 | 0.904 | 0.777 | 1.000 | 0.265 | 0.896 | 0.220 | 0.237 | 0.215 |
| promo_type_1 | 0.013 | 0.064 | 0.430 | 0.303 | 0.348 | 0.265 | 1.000 | 0.013 | -0.037 | -0.031 | 0.002 |
| promo_type_2 | 0.031 | 0.002 | 0.029 | 0.885 | 0.985 | 0.896 | 0.013 | 1.000 | -0.004 | -0.003 | -0.000 |
| revenue | -0.039 | -0.208 | 0.059 | -0.010 | 0.053 | 0.220 | -0.037 | -0.004 | 1.000 | 0.992 | 0.184 |
| sales | -0.039 | -0.252 | 0.062 | -0.014 | 0.054 | 0.237 | -0.031 | -0.003 | 0.992 | 1.000 | 0.202 |
| stock | -0.011 | -0.360 | 0.088 | -0.088 | 0.031 | 0.215 | 0.002 | -0.000 | 0.184 | 0.202 | 1.000 |
| | Unnamed: 0 | store_id | product_id | date | sales | revenue | stock | price | promo_type_1 | promo_bin_1 | promo_type_2 | promo_bin_2 | promo_discount_2 | promo_discount_type_2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | S0002 | P0001 | 2017-01-02 | 0.0 | 0.00 | 8.0 | 6.25 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 1 | 2 | S0002 | P0005 | 2017-01-02 | 0.0 | 0.00 | 11.0 | 33.90 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 2 | 3 | S0002 | P0011 | 2017-01-02 | 0.0 | 0.00 | 9.0 | 49.90 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 3 | 4 | S0002 | P0015 | 2017-01-02 | 1.0 | 2.41 | 19.0 | 2.60 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 4 | 5 | S0002 | P0017 | 2017-01-02 | 0.0 | 0.00 | 12.0 | 1.49 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 5 | 6 | S0002 | P0018 | 2017-01-02 | 1.0 | 1.81 | 37.0 | 1.95 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 6 | 7 | S0002 | P0024 | 2017-01-02 | 0.0 | 0.00 | 36.0 | 1.95 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 7 | 8 | S0002 | P0035 | 2017-01-02 | 2.0 | 4.54 | 15.0 | 2.45 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 8 | 9 | S0002 | P0046 | 2017-01-02 | 0.0 | 0.00 | 11.0 | 34.50 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 9 | 10 | S0002 | P0051 | 2017-01-02 | 7.0 | 4.54 | 132.0 | 0.70 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| | Unnamed: 0 | store_id | product_id | date | sales | revenue | stock | price | promo_type_1 | promo_bin_1 | promo_type_2 | promo_bin_2 | promo_discount_2 | promo_discount_type_2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8886048 | 8886049 | S0143 | P0639 | 2019-10-31 | NaN | NaN | NaN | 9.75 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 8886049 | 8886050 | S0143 | P0642 | 2019-10-31 | NaN | NaN | NaN | 4.00 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 8886050 | 8886051 | S0143 | P0658 | 2019-10-31 | NaN | NaN | NaN | 41.50 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 8886051 | 8886052 | S0143 | P0663 | 2019-10-31 | NaN | NaN | NaN | 6.75 | PR10 | verylow | PR03 | NaN | NaN | NaN |
| 8886052 | 8886053 | S0143 | P0664 | 2019-10-31 | NaN | NaN | NaN | 1.75 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 8886053 | 8886054 | S0143 | P0676 | 2019-10-31 | NaN | NaN | NaN | 19.90 | PR03 | verylow | PR03 | NaN | NaN | NaN |
| 8886054 | 8886055 | S0143 | P0680 | 2019-10-31 | NaN | NaN | NaN | 139.90 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 8886055 | 8886056 | S0143 | P0694 | 2019-10-31 | NaN | NaN | NaN | 7.50 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 8886056 | 8886057 | S0143 | P0718 | 2019-10-31 | NaN | NaN | NaN | 23.75 | PR14 | NaN | PR03 | NaN | NaN | NaN |
| 8886057 | 8886058 | S0143 | P0747 | 2019-10-31 | NaN | NaN | NaN | 21.90 | PR14 | NaN | PR03 | NaN | NaN | NaN |